Odysseus/DFS: Integration of DBMS and Distributed File System for Transaction Processing of Big Data

Abstract

The relational DBMS (RDBMS) has been widely used because it supports various high-level functionalities such as SQL, schemas, indexes, and transactions that do not exist in the O/S file system. However, the recent advent of big data technology has facilitated the development of new systems that sacrifice DBMS functionality in order to manage large-scale data efficiently. These so-called NoSQL systems use a distributed file system (DFS), which supports scalability and reliability. They achieve scalability by storing data across a large number of low-cost commodity machines, and reliability by storing the data in replicas. However, they have the drawback that they do not adequately support high-level DBMS functionality. In this paper, we propose an architecture of a DBMS that uses the DFS as storage. With this novel architecture, the DBMS supports the scalability and reliability of the DFS as well as the high-level functionality of a DBMS. Thus, a DBMS can utilize the virtually unlimited storage space provided by the DFS, making it suitable for big data analytics. As part of the architecture, we propose the notion of the meta DFS file, which allows the DBMS to use the DFS as storage, and an efficient transaction management method including recovery and concurrency control. We implement this architecture in Odysseus/DFS, an integration of the DFS with the Odysseus relational DBMS, which has been under development at KAIST for over 24 years. Our experiments on transaction processing show that, due to its high-level functionality, Odysseus/DFS outperforms HBase, a representative open-source NoSQL system. We also show that, compared with an RDBMS with local storage, the performance of Odysseus/DFS is comparable or only marginally degraded, indicating that the overhead Odysseus/DFS incurs for supporting scalability by using the DFS as storage is not significant.